Automatic Refinement of Syntactic Categories in Chinese Word Structures
نویسنده
چکیده
Annotated word structures are useful for various Chinese NLP tasks, such as word segmentation, POS tagging and syntactic parsing. Chinese word structures are often represented by binary trees, the nodes of which are labeled with syntactic categories, due to the syntactic nature of Chinese word formation. It is desirable to refine the annotation by labeling nodes of word structure trees with more proper syntactic categories so that the combinatorial properties in the word formation process are better captured. This can lead to improved performances on the tasks that exploit word structure annotations. We propose syntactically inspired algorithms to automatically induce syntactic categories of word structure trees using POS tagged corpus and branching in existing Chinese word structure trees. We evaluate the quality of our annotation by comparing the performances of models based on our annotation and another publicly available annotation, respectively. The results on two variations of Chinese word segmentation task show that using our annotation can lead to significant performance improvements.
منابع مشابه
Keynote Speech: Lexical Semantics of Chinese Language
In this talk, we are going to give a systematic view of lexical semantics of Chinese language. From macro perspective point of view, lexical conceptual meanings are classified into hierarchical semantic types and each type plays some particular semantic functions of Host, Attribute, and Value to form a semantic compositional system. Lexical senses and their compositional functions will be exemp...
متن کاملLearning Grammar with Explicit Annotations for Subordinating Conjunctions
Data-driven approach for parsing may suffer from data sparsity when entirely unsupervised. External knowledge has been shown to be an effective way to alleviate this problem. Subordinating conjunctions impose important constraints on Chinese syntactic structures. This paper proposes a method to develop a grammar with hierarchical category knowledge of subordinating conjunctions as explicit anno...
متن کاملDesign of Chinese Morphological Analyzer
This is a pilot study which aims at the design of a Chinese morphological analyzer which is in state to predict the syntactic and semantic properties of nominal, verbal and adjectival compounds. Morphological structures of compound words contain the essential information of knowing their syntactic and semantic characteristics. In particular, morphological analysis is a primary step for predicti...
متن کاملSyntactic Function-Based Chinese Lexical Categories and Category Grammar Parsing
By merging syntactic categories of word classes, lexical categories were obtained. By demonstrating combination and type raising rules respectively from curried and uncurried perspectives, a category combination algorithm was presented, in which application, composition and type raising rules were sequentially examined, and the first available rule was selected. A Chinese CCG parser was develop...
متن کاملSixth International Joint Conference on Natural Language Processing Proceedings of the Seventh SIGHAN Workshop on Chinese Language Processing
In this talk, we are going to give a systematic view of lexical semantics of Chinese language. From macro perspective point of view, lexical conceptual meanings are classified into hierarchical semantic types and each type plays some particular semantic functions of Host, Attribute, and Value to form a semantic compositional system. Lexical senses and their compositional functions will be exemp...
متن کامل